Statistics of Largest Loops in a Random Walk 
We report further findings on the size distribution of the largest neutral segments in a sequence of $N$ randomly charged monomers [D. Ertaş and Y. Kantor, Phys. Rev. E 53, 846 (1996)]. Upon mapping to one-dimensional random walks (RWs), this corresponds to finding the probability distribution for the size $L$ of the largest segment that returns to its starting position in an $N$-step RW. We primarily focus on the large $N$, $\ell = L/N \ll 1$ limit, which exhibits an essential singularity. We establish analytical upper and lower bounds on the probability distribution, and numerically probe the distribution down to $\ell \approx 0.04$ (corresponding to probabilities as low as $10^{-15}$) using a recursive Monte Carlo algorithm. We also investigate the possibility of singularities at $\ell = 1/k$ for integer $k$.



PACS Numbers: 02.50.-r,05.40.+j 



I. INTRODUCTION 



It has recently been shown that ground state conformations of polyampholytes, a particular type of heteropolymer built with a random mixture of positively and negatively charged groups along the backbone, are extremely sensitive to their total (excess) charge $Q$. A detailed study of the $Q$-dependence of the radius of gyration $R_g$ determined that a reasonable compromise between stretching (which minimizes the electrostatic energy) and remaining compact (which gains in condensation energy) is for the polyampholyte to form a necklace of weakly charged blobs connected by highly charged "necks", by taking advantage of the charge fluctuations along the chain. The results of Monte Carlo and exact enumeration studies qualitatively support such a picture.

While an exact treatment of electrostatic interactions is not possible, we can pose a simplified problem which, we hope, captures some essential features of this necklace model. For example, we may ask what the typical size of the largest neutral (or weakly charged) segment in a random sequence of $N$ charges will be. In order to answer this question, we investigated the size distribution of the largest neutral segments in polyampholytes with $N$ monomers ($N$-mers). This problem can be mapped to a one-dimensional random walk (RW): the sequence of charges $\{q_i\}$ ($i = 1, \ldots, N$; $q_i = \pm 1$) corresponds to an $N$-step walk $\omega = \{q_1, \ldots, q_N\}$ with the same sequence of unit steps in the positive or negative directions along an axis, where the probability of going up or down is equal to 1/2 at each step. Fig. 1 depicts an example of such a sequence and the corresponding path, where $S_i(\omega) = \sum_{j=1}^{i} q_j$ is the position of the path at index $i$ [with $S_0(\omega) = 0$]. A segment of $L$ monomers with zero total charge thus corresponds to an $L$-step loop inside the RW. In this paper, we further investigate properties of the probability $P_N(L)$ that the largest loop in an $N$-step RW has length $L$, or, equivalently, the probability $Z_N(L) = \sum_{L'=0}^{L-1} P_N(L')$ that all loops in an $N$-step RW are shorter than $L$. Results on a generalized version of this and other related problems have been reported previously. In the continuum ($N \to \infty$) limit, it is more convenient to work with the probability density

$$p(\ell) \equiv \lim_{N\to\infty} \frac{N}{2}\left[P_N(L) + P_N(L+1)\right] \qquad (1)$$

and

$$z(\ell) \equiv \lim_{N\to\infty} Z_N(L) = \int_0^{\ell} d\ell'\, p(\ell'), \qquad (2)$$

where $\ell = L/N$ is the appropriate scaling variable for this problem.
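As an illustration of the mapping described above, the following Python sketch computes the largest loop of a charge sequence and obtains $Z_N(L)$ by exhaustive enumeration for small $N$. The function names `largest_loop` and `exact_Z` are ours and do not come from the original work; the enumeration costs $2^N$ operations and is only meant for checking small cases.

```python
from itertools import accumulate, product


def largest_loop(charges):
    """Length of the longest neutral segment (loop) in a sequence of +/-1 charges."""
    positions = [0] + list(accumulate(charges))  # S_0, S_1, ..., S_N
    first_visit = {}
    longest = 0
    for index, height in enumerate(positions):
        if height in first_visit:
            # The segment between the first visit and this index has zero total charge.
            longest = max(longest, index - first_visit[height])
        else:
            first_visit[height] = index
    return longest


def exact_Z(n_steps, loop_size):
    """Fraction of all n_steps-step walks whose loops are all shorter than loop_size."""
    hits = sum(
        1
        for walk in product((-1, 1), repeat=n_steps)
        if largest_loop(walk) < loop_size
    )
    return hits / 2 ** n_steps


print(largest_loop([1, 1, -1, -1, 1, 1]))
print(exact_Z(4, 2))
```

Recording only the first visit to each height is enough, because the longest loop ending at a given index always starts at the earliest index with the same height.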




FIG. 1. Example of a sequence $\omega$ with $N = 14$ charges, and the corresponding walk depicted by $S_i(\omega)$. In this case, the longest loops have lengths $L = 10$ (dotted lines).

The formulation of the problem is deceptively simple: it is similar (and related) to classical RW problems, such as the problem of first passage times or the problem of last return to the starting point, for which probability distributions can be computed exactly by using the method of reflections, and which obey the same scaling in the continuum limit. However, the search for the longest loop of the RW, among all possible starting points, creates a more complicated problem. In its essence, the problem is more closely related to the statistics of self-avoiding, rather than regular, random walks. This relation becomes more transparent in the $\ell \to 1$ and $\ell \to 0$ limits. The former limit was extensively studied in our earlier work, and the latter will be discussed in Sec. III. The "self-interacting nature" of the problem can be seen even more clearly in its generalizations to arbitrary space dimension $d$, where many analogies between this problem and self-avoiding walks exist.

Our earlier investigations revealed remarkable properties of the probability density $p(\ell)$: It diverges as $p(\ell) \sim 1/\sqrt{1-\ell}$ for $\ell \to 1$, and has a discontinuous derivative at $\ell = 1/2$. Furthermore, it has an essential singularity at $\ell = 0$ of the form $p(\ell) \sim \exp(-B/\ell)$. An analytical solution in this limit still remains elusive. We had not been able to determine $p(\ell)$ even numerically below $\ell \approx 0.15$, because the very small probabilities involved near $\ell = 0$ severely limit a straightforward Monte Carlo approach. Because of these difficulties, the existence and precise form of this singularity (including possible power-law prefactors etc.) were not well established. Since the publication of that work, we have developed an improved Monte Carlo algorithm that is capable of probing significantly smaller values of $\ell$ numerically. Combined with strict analytical bounds on $z(\ell)$, the results strongly favor the existence of this singularity, and the proper form of the $\ell \to 0$ limit can be determined with high precision. In this paper, we report the results of these complementary findings.

It should be noted that similar behavior is exhibited by extremal properties of a number of random processes, such as a one-dimensional random cutting process (which can be generalized to higher dimensions) and return times in a random walk. These models exhibit singularities at $\ell = 1/k$, which become progressively weaker as the integer $k$ is increased, leading to an essential singularity at $\ell = 0$. Although it was claimed that our problem falls into the same category and therefore should exhibit singularities at $\ell = 1/2, 1/3, 1/4, \ldots$, we believe that it differs from these models in a way that undermines the reasoning for this claim, as we shall discuss in Sec. IV. In particular, we have numerically verified that the suggested singularity at $\ell = 1/3$ does not exist, unless it has a very small prefactor.

The rest of the paper is organized as follows: First, we establish upper and lower bounds on $z(\ell)$. We then describe an efficient Monte Carlo algorithm that enables us to determine $z(\ell)$ down to very small values, and present results from its implementation. Finally, we discuss the possible relevance of other random models with similar characteristic properties.



II. UPPER AND LOWER BOUNDS 

In this section, we establish rigorous upper and lower bounds on the probability distribution $z(\ell)$, both of which have the same functional form. The existence of these bounds significantly restricts the possible asymptotic forms of $z(\ell)$ in the $\ell \to 0$ limit.

The main strategy is similar for establishing both upper and lower bounds. Walks whose largest loops are much smaller than their overall length are typically very biased in one direction, and sections of the walk that are separated by more than the largest loop size are very weakly correlated. For a given (small) value of $\ell$, let us divide each walk into roughly $1/\ell$ segments of similar size. There are necessary conditions that each segment must satisfy independently for the overall walk to contribute to $z(\ell)$. If the probability for a random segment to satisfy these conditions is $p_n$, then $z(\ell) \lesssim p_n^{1/\ell}$. Similarly, each segment can be designed to satisfy certain conditions that are sufficient to ensure that the overall walk contributes to $z(\ell)$. If the corresponding probability for these conditions is $p_s$, then $z(\ell) \gtrsim p_s^{1/\ell}$. The rest of this section is devoted to establishing a set of necessary conditions and a set of sufficient conditions, and calculating the corresponding probabilities.

Let us first investigate necessary conditions. Let $\omega$ be an $N$-step walk whose largest loop is less than $L$ steps long, and which has $S_N(\omega) > 0$. We shall focus on the cases where $m = N/L$ is an integer for now. Let us split $\omega$ into $m$ non-overlapping segments $\{\omega_1, \ldots, \omega_m\}$ of length $L$, where $\omega_i = \{q_{(i-1)L+1}, \ldots, q_{iL}\}$. It is easy to see that $\omega$ satisfies the inequalities

$$S_{iL}(\omega) > S_{(i-1)L}(\omega), \qquad 0 < i \le m, \qquad (3)$$

or, equivalently,

$$S_L(\omega_i) > 0, \qquad 0 < i \le m, \qquad (4)$$

i.e., each of the $m$ segments needs to have a positive displacement. The probability for this is just $p_n = 1/2$, and therefore $Z_N(N/m) < 2^{1-m}$ (the additional factor of 2 comes from RWs with $S_N < 0$). Consequently, $Z_N(L) < 2^{2-(N/L)}$ for any value of $N$ and $L$. This establishes a strict upper bound, which is significant for small values of $\ell$:

$$z(\ell) < 4\exp(-\ln 2/\ell). \qquad (5)$$
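The bound $Z_N(L) < 2^{2-N/L}$ can be checked directly on small systems. The sketch below (Python, with helper names of our choosing) enumerates all walks of a few small lengths and compares the exact $Z_N(L)$ with the bound. We restrict the check to even $L$, since loops always have even length; this is a consistency check under that assumption, not part of the original derivation.

```python
from itertools import accumulate, product


def largest_loop(steps):
    """Longest segment of a +/-1 walk that returns to its starting height."""
    first_visit, longest = {}, 0
    for index, height in enumerate([0] + list(accumulate(steps))):
        longest = max(longest, index - first_visit.setdefault(height, index))
    return longest


def exact_Z_table(n_steps):
    """Return {L: Z_N(L)} for even L, by enumerating all 2**n_steps walks."""
    loops = [largest_loop(w) for w in product((-1, 1), repeat=n_steps)]
    return {
        L: sum(1 for size in loops if size < L) / len(loops)
        for L in range(2, n_steps + 1, 2)
    }


def check_upper_bound(n_steps):
    """True if Z_N(L) < 2**(2 - N/L) for every even L up to N."""
    return all(
        z < 2.0 ** (2.0 - n_steps / L) for L, z in exact_Z_table(n_steps).items()
    )


for n in (6, 8, 10, 12):
    print(n, check_upper_bound(n))
```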

It is possible to further improve on this upper bound, and we will next demonstrate such an improvement, which is by no means final. Consider a pair of adjacent segments (e.g., $\omega_1$ and $\omega_2$) described above, with $S_L(\omega_1), S_L(\omega_2) > 0$. Let $i$ be the smallest index where $S_i(\omega_1) = S_L(\omega_1)$, and $j$ the largest index where $S_j(\omega_2) = 0$. In that case, the segment from $i$ to $L + j$ (on $\omega$) is a loop, and therefore $i > j$, since $\omega$ cannot have a loop larger than $L$. For two randomly selected segments, this condition is satisfied with probability 1/2, which can be calculated from the known probability distribution of "last return to the origin". Since there are $m/2$ statistically independent adjacent pairs, this observation further suppresses the upper bound on the probability distribution by a factor of $2^{m/2}$, improving the overall upper bound to

$$z(\ell) < 4\sqrt{2}\,\exp(-1.5\ln 2/\ell), \qquad (6)$$

which gives the best analytical lower bound so far on the exponential factor, $B > 1.5\ln 2 \approx 1.03972$.
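The value 1/2 quoted for the pair condition is a continuum-limit statement. As a rough numerical check, the following Python sketch assumes (this is our reading of the argument, not a statement from the text) that, by time reversal, $L - i$ and $j$ are independent and each distributed as the last return to the origin of an $L$-step walk conditioned on $S_L \neq 0$. For that distribution we use the classical result that the last return occurs at step $2k$ with probability $u_{2k}u_{L-2k}$, where $u_{2k} = \binom{2k}{k}2^{-2k}$. The function name and the values of $L$ are illustrative.

```python
def pair_condition_probability(L):
    """P(i > j) for two independent L-step segments with nonzero displacement (L even)."""
    half = L // 2
    # u[k] = probability that a 2k-step walk is back at the origin.
    u = [1.0]
    for k in range(1, half + 1):
        u.append(u[-1] * (2 * k - 1) / (2 * k))
    # Last return at step 2k, conditioned on S_L != 0 (this excludes k = half).
    norm = 1.0 - u[half]
    w = [u[k] * u[half - k] / norm for k in range(half)]
    # i > j is equivalent to (L - i) + j < L, i.e. k1 + k2 < half.
    cumulative, running = [], 0.0
    for weight in w:
        running += weight
        cumulative.append(running)
    return sum(w[k1] * cumulative[half - k1 - 1] for k1 in range(half))


for L in (40, 400, 4000):
    print(L, pair_condition_probability(L))
```

Under these assumptions the probability approaches 1/2 only slowly as $L$ grows, so the factor $2^{m/2}$ should be read as an asymptotic statement.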

In order to find a lower bound on the probability distribution, let us again consider the sequence $\omega$ and its $m$ pieces $\{\omega_i\}$ of length $L$ each. We would like to construct each $\omega_i$ independently in such a way as to guarantee that the resulting walk $\omega$ does not have loops larger than $L$. This can again be done in many different ways, and the following is by no means optimal. The quality of the bound usually depends on how complicated the specifications of each piece are, and the limiting factor seems to be the analytical tractability of the associated probabilities. The following represents the best bound we have been able to establish analytically.

The specifications of each piece are as follows:

$$\begin{aligned} -a &< S_i < S_L - a, & 0 \le i \le L/2, \\ a &< S_i < S_L + a, & L/2 \le i \le L. \end{aligned} \qquad (7)$$

Figure 2(a) shows these specifications graphically. Clearly, $S_L > 2a$ is required. Figure 2(b) shows how the joining of such pieces results in a sequence $\omega$ that has no loops larger than $L$.
FIG. 2. (a) Example of a walk that satisfies the conditions in Eq. (7). Each such walk remains entirely within the shaded area. (b) When such walks are joined together, the resulting walk does not have loops that are larger than or equal to $L$, since such loops cannot fit in the shaded area.
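The geometric argument of Fig. 2 can be tested by brute force. The Python sketch below assumes our reading of the conditions on each piece, namely $-a < S_i < S_L - a$ for $0 \le i \le L/2$ and $a < S_i < S_L + a$ for $L/2 \le i \le L$. It enumerates all pieces of a small length that satisfy these conditions, joins randomly chosen pieces, and reports the largest loop found. The parameter values and function names are illustrative only.

```python
import random
from itertools import accumulate, product


def largest_loop(steps):
    """Longest segment of a +/-1 walk that returns to its starting height."""
    first_visit, longest = {}, 0
    for index, height in enumerate([0] + list(accumulate(steps))):
        longest = max(longest, index - first_visit.setdefault(height, index))
    return longest


def satisfies_corridor(steps, a):
    """Check the two-sided conditions on S_i for one piece of even length L."""
    L = len(steps)
    S = [0] + list(accumulate(steps))
    first_half = all(-a < S[i] < S[L] - a for i in range(0, L // 2 + 1))
    second_half = all(a < S[i] < S[L] + a for i in range(L // 2, L + 1))
    return first_half and second_half


def valid_pieces(L, a):
    """All L-step pieces obeying the corridor conditions (2**L enumeration)."""
    return [w for w in product((-1, 1), repeat=L) if satisfies_corridor(w, a)]


def max_loop_in_joined_walks(L, a, n_pieces, n_trials, seed=0):
    """Join random valid pieces; return the largest loop seen and the piece fraction."""
    rng = random.Random(seed)
    pieces = valid_pieces(L, a)
    worst = 0
    for _ in range(n_trials):
        walk = [step for _ in range(n_pieces) for step in rng.choice(pieces)]
        worst = max(worst, largest_loop(walk))
    return worst, len(pieces) / 2 ** L


worst, fraction = max_loop_in_joined_walks(L=12, a=0.5 * 12 ** 0.5, n_pieces=6, n_trials=200)
print(worst, fraction)
```

In every joined walk the largest loop should stay below the piece length, which is the property the construction is designed to guarantee.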

The probability $p_s$ of meeting the stated specifications can be evaluated numerically to high accuracy using the method of reflections and summing over all possible values of $S_{L/2}$ and $S_L$ for a given $a$. The largest value for the probability yields the tightest lower bound on $z(\ell)$, so it is desirable to tune $a$ in order to optimize the bound. We pick $a = 0.5\sqrt{L}$, which is very close to the optimal value. In that case, the probability for a RW to satisfy the requirements (7) for large $L$ is $p_s \approx 0.031585$. This yields

$$z(\ell) > 2p_s \exp(-|\ln p_s|/\ell) \approx 0.06317\, e^{-3.455/\ell}. \qquad (8)$$

Clearly, neither the upper nor the lower bound we have established is very tight, and they do not rule out the possibility of a power-law prefactor. However, there is very convincing numerical evidence that there is no power-law prefactor in $z(\ell)$, i.e., that $z(\ell) \simeq C\exp(-B/\ell)$ as $\ell \to 0$, where $C$ and $B$ are constants that are determined in the following section.
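For readers who want to reproduce the order of magnitude of $p_s$ without the method of reflections, the following Python sketch counts admissible pieces by dynamic programming instead (a substitute technique, not the one used above). It assumes our reading of the piece conditions, $-a < S_i < S_L - a$ for $0 \le i \le L/2$ and $a < S_i < S_L + a$ for $L/2 \le i \le L$, with $a = 0.5\sqrt{L}$ by default. The function names are hypothetical, and at the moderate $L$ used here the result need not have converged to the large-$L$ value quoted in the text.

```python
from math import sqrt


def allowed(i, height, target, a, L):
    """Corridor conditions at step i for a piece that ends at S_L = target."""
    if i <= L // 2 and not (-a < height < target - a):
        return False
    if i >= L // 2 and not (a < height < target + a):
        return False
    return True


def corridor_probability(L, a=None):
    """Probability that an L-step walk (L even) satisfies both corridor conditions."""
    if a is None:
        a = 0.5 * sqrt(L)
    total = 0
    for target in range(-L, L + 1, 2):           # candidate end points S_L
        counts = {0: 1} if allowed(0, 0, target, a, L) else {}
        for i in range(1, L + 1):
            new_counts = {}
            for height, ways in counts.items():
                for step in (-1, 1):
                    h = height + step
                    if allowed(i, h, target, a, L):
                        new_counts[h] = new_counts.get(h, 0) + ways
            counts = new_counts
        total += counts.get(target, 0)           # walks that really end at target
    return total / 2 ** L


for L in (16, 36, 64):
    print(L, corridor_probability(L))
```

The end point is fixed in the outer loop because the corridor itself depends on $S_L$; summing over end points plays the role of the sum over $S_L$ mentioned in the text.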



III. NUMERICAL WORK 

In this section, we present numerical studies to determine $p(\ell)$ and $z(\ell)$ in the $\ell \ll 1$ limit. As stated earlier, a standard Monte Carlo method of determining $p(\ell)$ from a random sample of all possible walks is ineffective at probing $\ell < 0.15$, since the probabilities become very small. A similar problem arises when it is necessary to randomly sample very large self-avoiding walks (SAWs) in two and three dimensions: The probability of generating a SAW is exponentially small in its overall length, i.e., the probability of picking a SAW out of RWs of length $N \gg 1$ scales as $P_{\rm SAW}(N) \sim N^{\gamma-1}e^{-\alpha N}$, where $\alpha$ and $\gamma$ are constants that depend only on the dimensionality of the SAW. A common way to circumvent this problem is to build large SAWs recursively by joining smaller SAWs. This method significantly reduces the number of operations needed by completely eliminating its dependence on the leading exponential factor: The probability of creating a SAW of length $N$ by joining two randomly selected SAWs of length $N/2$ scales only as a power of $N$, and the number of operations needed to generate a randomly sampled SAW grows as $\exp[O((\log_2 N)^2)]$ instead of $e^{\alpha N}$. Of course, creating SAWs in one dimension is trivial, but the extension of this method to one-dimensional walks is still very useful for our problem, since creating RWs with very small loops is similar to creating SAWs [in fact, $P_{\rm SAW}(N) = Z_N(1)$], and can be used to sample $z(\ell)$ efficiently at small $\ell$.

In this implementation of the algorithm, we start from pairs of RWs of length $L$ (with nonzero total displacement) and join them, keeping only resultant walks whose largest loops are smaller than $L$. At the first level, this creates walks that contribute to $Z_{2L}(L)$, with equal probability. We then iterate this process by pairing the resultant walks at each level. After the $n$th level, we end up with a representative sample of all walks that contribute to $Z_{2^n L}(L)$, which can then be used to determine a histogram for the probability distributions for $0 < \ell < 2^{-n}$.
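A minimal Python sketch of the recursive joining procedure described above is given here. It is not the original implementation: the function names, the choice to draw two fresh independent sub-walks for every attempt (rather than pairing walks from a stored pool), and the parameter values are our assumptions. The ratio of successes to attempts at each level gives a crude estimate of the success probability of that level.

```python
import random
from itertools import accumulate


def largest_loop(steps):
    """Longest segment of a +/-1 walk that returns to its starting height."""
    first_visit, longest = {}, 0
    for index, height in enumerate([0] + list(accumulate(steps))):
        longest = max(longest, index - first_visit.setdefault(height, index))
    return longest


def sample_walk(level, L, rng, attempts, successes):
    """Sample a walk of length 2**level * L whose loops are all shorter than L."""
    if level == 0:
        while True:
            walk = [rng.choice((-1, 1)) for _ in range(L)]
            if sum(walk) != 0:  # an L-step walk contains a loop of length L only if S_L = 0
                return walk
    while True:
        left = sample_walk(level - 1, L, rng, attempts, successes)
        right = sample_walk(level - 1, L, rng, attempts, successes)
        attempts[level] = attempts.get(level, 0) + 1
        if largest_loop(left + right) < L:
            successes[level] = successes.get(level, 0) + 1
            return left + right


rng = random.Random(2024)
attempts, successes = {}, {}
walks = [sample_walk(3, 16, rng, attempts, successes) for _ in range(20)]
for level in sorted(attempts):
    print(level, successes[level] / attempts[level])
```

Because both halves of an accepted walk are themselves uniformly sampled from the previous level, accepting or rejecting the joined pair keeps the sample uniform over all walks of the doubled length that have no loop of length $L$ or more.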

We also need to keep track of the probability of success $R_n$ at each level, which is given by

$$R_n(L) = \frac{Z_{2^n L}(L)}{\left[Z_{2^{n-1}L}(L)\right]^2}, \qquad (9)$$

in order to determine the overall normalization of the probability distributions. One big advantage of studying one-dimensional walks is that the probability of success $R_n(L)$ actually becomes independent of $n$, i.e., in the continuum limit

$$z(\ell) = R\,[z(2\ell)]^2, \qquad \ell \ll 1, \qquad (10)$$

where $R = \lim_{L\to\infty}\lim_{n\to\infty} R_n(L)$ is a nonzero constant. (For the one-dimensional SAW, the probability of success is just 1/2.) Typically, variations in $R_n(L)$ were within statistical fluctuations (0.1 to 0.3%) for $n > 3$. When $R_n(L)$ is independent of $n$, the number of operations needed to sample a representative walk that contributes to $z(\ell)$ is only polynomial in $1/\ell$, which speeds up the algorithm enormously. Furthermore, this implies that for $\ell \ll 1$,



$$z(\ell) = C\exp(-B/\ell), \qquad (11)$$

$$p(\ell) = \frac{BC}{\ell^2}\exp(-B/\ell), \qquad (12)$$

where $C = R^{-1}$ and $B$ are constants; there are no power-law prefactors in $z(\ell)$. This result can be verified numerically by looking at the results of the described recursive algorithm: Fig. 3 confirms the functional form (12) over about twelve decades in the probability density $p(\ell)$, probing values of $\ell$ down to 0.04.
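The step from the recursion (10) to the exponential form (11) can be made explicit. The short derivation below is our own filling-in; it assumes that $z(\ell)$ carries no log-periodic modulation in $\ell$, which the recursion alone does not exclude.

```latex
\begin{align*}
  f(\ell) &\equiv \ln\!\left[R\,z(\ell)\right]
  &&\Rightarrow\quad f(\ell) = 2 f(2\ell) \quad \text{by Eq.~(10)},\\
  g(\ell) &\equiv \ell f(\ell)
  &&\Rightarrow\quad g(\ell) = g(2\ell),\\
  g(\ell) &= -B \ \text{(constant)}
  &&\Rightarrow\quad z(\ell) = R^{-1} e^{-B/\ell} = C e^{-B/\ell},\\
  p(\ell) &= \frac{dz}{d\ell} = \frac{BC}{\ell^{2}}\, e^{-B/\ell}.
\end{align*}
```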




[Plot not reproduced; horizontal axis: $1/\ell$.]

FIG. 3. The probability density $p(\ell)$ for $0.04 < \ell < 1/2$ confirms the suggested form (12) down to probabilities as low as $10^{-15}$. The overall walk size is $N = 2048$. Four (partially overlapping) plots were generated from runs that terminated after recursion levels 1 through 4.



The constants $C$ and $B$ in the continuum limit can be determined accurately by plotting their dependence on walk length. $C$ is simply the inverse of the success probability $R$, as mentioned earlier, whereas $B$ is given by the slope of the graph in Fig. 3. Fig. 4 shows these plots, which yield

$$C = 4.57 \pm 0.01, \qquad (13)$$

$$B = 1.73 \pm 0.02. \qquad (14)$$


FIG. 4. Size dependence of the constants that appear in the $L/N \ll 1$ limit of $Z_N(L)$ and $P_N(L)$. Top: The exponential constant $B(N)$ determined from plots of $\ell^2 p(\ell)$ as a function of total walk length $N$. Statistical errors are smaller than symbol sizes. Bottom: The prefactor $C(L)$ determined from success probabilities $R_n(L)$ as a function of largest loop size $L$. Statistical errors are roughly the size of symbols.



IV. RELATED PROBLEMS 

Behavior that is strikingly similar to that of $p(\ell)$ is exhibited by probability distributions of extremal properties in certain random systems. One simple example is a one-dimensional random cutting process: A unit interval is cut at a randomly selected point (with uniform probability), and the same cutting process is repeatedly applied to the interval that remains to the right of the latest cut, ad infinitum. The probability distribution $p'(\ell)$ for the size of the largest interval that remains at the end of the cutting process exhibits singularities of the form $|\ell - 1/k|^{k-1}$ at each value of $k$, which become progressively weaker as the integer $k$ is increased, leading to an essential singularity at $\ell = 0$. The origin of these singularities can be traced to the fact that the pieces (among which the largest one is chosen) constitute a partition of the entire interval, which implies that the sum of the sizes of all pieces equals the size of the initial interval, which is 1. Consequently, any piece that is larger than 1/2 is necessarily the largest, and in general there can be at most $k - 1$ pieces that are larger than $1/k$. This causes singular behavior in $p'(\ell)$ at $\ell = 1/k$ for all $k$. Similar "sum rules" apply to all the other systems discussed in the literature cited above. However, this property is not satisfied by our problem, since loops can and do overlap. We have numerically examined the vicinity of $\ell = 1/3$, and conclude that there are no singularities in the first and second derivatives of $p(\ell)$ with a prefactor of $O(1)$. Although we cannot rule out the possibility of weaker singularities or unusually small prefactors, the evidence seems to suggest that they do not exist.
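The random cutting process is easy to simulate, which makes the "sum rule" argument concrete. The Python sketch below (function names and sample size are ours) draws the largest piece many times and estimates the probability that it exceeds 1/2. Since a piece larger than 1/2 is necessarily the largest, this probability can be compared with the value $\ln 2$, which we quote as a standard result for uniform stick breaking rather than as a statement from the text.

```python
import random
from math import log


def largest_piece(rng, tolerance=1e-12):
    """Largest piece left by repeatedly cutting the right-hand remainder of a unit interval."""
    remaining, largest = 1.0, 0.0
    # Stop once the remainder can no longer contain a piece bigger than the current largest.
    while remaining > largest and remaining > tolerance:
        cut = rng.random() * remaining   # length of the piece to the left of the new cut
        largest = max(largest, cut)      # that piece is never cut again
        remaining -= cut
    return largest


def fraction_above(threshold, n_samples, seed=0):
    """Monte Carlo estimate of P(largest piece > threshold)."""
    rng = random.Random(seed)
    hits = sum(largest_piece(rng) > threshold for _ in range(n_samples))
    return hits / n_samples


print(fraction_above(0.5, 20000), log(2))
```

No analogous shortcut exists for loops in a random walk, because overlapping loops do not partition the walk.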



V. CONCLUSION 

With the help of an efficient Monte Carlo algorithm and analytical upper and lower bounds, we have clarified some of the issues surrounding the behavior of the probability density $p(\ell)$ for small values of its argument, and we have been able to better understand and characterize the essential singularity at $\ell = 0$. In this limit, the connection of this problem to SAWs becomes much more transparent, and it is likely that this connection can be further exploited.